DOCUMENT RESUME

ED 201 200                                        FL 0[illegible] 230

AUTHOR         Dubois, David D.
TITLE          The Children's English and Services Study: A
               Methodological Review.
INSTITUTION    National Center for Education Statistics (DHEW),
               Washington, D.C.
PUB DATE       Aug 80
NOTE           54p.; For related documents, see ED 1[illegible]3 971-972;
               faint type.
EDRS PRICE     MF01/PC03 Plus Postage.
DESCRIPTORS    English (Second Language); *Evaluation Methods;
               Language Proficiency; *Testing
IDENTIFIERS    *Childrens English and Services Study; Limited
               English Speaking

ABSTRACT

The Children's English and Services Study was a
project designed to assess the bilingual education needs of limited
English speaking children in the United States. The submission of a
draft final report prompted the present report from the sponsoring
organization, in which various methodological procedures are
questioned and recommendations are made for the revision of the
report. The three analytical issues involved are: (1) were the items
selected for inclusion in the Language Measurement and Assessment
Inventory (LM&AI) selected properly? (2) were the cutoff scores for
the LM&AI, which were determined and used to classify children as
either English proficient or of limited English proficiency (LEP),
set properly? and (3) what were the effects of non-response bias on
the counts and estimates of the number of LEP children? With respect
to (1), it is recommended that certain caveats be set forth in the
final report. Criteria are introduced that, with respect to (2),
actually revise the figures regarding the number of LEP children.
Further investigations of nonresponse bias were found not to be
warranted. (JB)




ERIC 



* Reproductions supplied by EDRS are the best that can be made *
* from the original document.                                   *






Dr. David D. Dubois
Education Policy Fellow




im :,c 1980 



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION

THIS DOCUMENT HAS BEEN REPRODUCED
EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING
IT. POINTS OF VIEW OR OPINIONS
STATED DO NOT NECESSARILY REPRESENT
OFFICIAL NATIONAL INSTITUTE OF
EDUCATION POSITION OR POLICY.






INTRODUCTION 



On January 21, 1980, the Office of Research and Analysis (ORA) of the National
Center for Education Statistics (NCES), U.S. Department of Education, issued
a report entitled Analytical Issues Regarding the Children's English and Services
Study (AI/CESS). The purpose of the AI/CESS report was:

. . . to present and discuss three analytical issues which
have been identified as a result of a post hoc assessment of
the research design, data analyses and other information which
are described in the 1978 Children's English and Services Study
(CESS) Draft Report of September 6, 1979 (and a later revision
dated November 1979).1

A copy of the AI/CESS report is found in appendix A.

The objective of the present NCES/ORA inquiry is best summarized by the following
passage from the January report:

Since the results of the 1978 CESS are of tremendous
importance to present and future research studies,
bilingual (education) program and policy development,
and funding for bilingual education, unresolved
analytical issues which could adversely affect the
validity of the results are being stated with the
hope of their resolution.2

It is generally recognized that secondary analyses of data and research designs
frequently reveal analysis errors or areas of skepticism in the design.
Sterling and Weinkam (1979), who discovered misclassifications in a
study of mortality among U.S. veterans, describe the potential response
of managers to this discovery as either "cooperative" or "adversary."

In the former case, an attempt is made to determine the source of concern and
to restructure the procedures or analyses. In the latter case, attempts are
made to eliminate the discovery of errors rather than their source.

Discussing this problem, Sterling and Weinkam further observed that:

... there may be underlying sociological and psychological 
forces operating which make it more acceptable for manage- 
ment to adopt an adversary rather than a cooperative stance 
even in scientific instances. From a simpleminded perspec- 
tive, to acknowledge the existence of errors may require 
considerable effort and expenditures to correct them, not
to say anything of extracting accountability from some
individuals who insist on bringing these errors to the
attention of management as troublemakers.3



They continued by indicating that "as the value [illegible] data files
for secondary analysis become increasingly clear [illegible] a great deal of
value in the use of properly collected [illegible] expect that
other discoveries similar to ours will be [illegible] that a
mechanism be established for the encouragement of [illegible] for the
validation of the appropriateness or [illegible] of
the results of these analyses."4



NCES/ORA recognizes that many research [illegible] in
one or more ways and the CESS Draft Report [illegible].
Accordingly, NCES/ORA's purpose in this report is [illegible] and to make
specific recommendations for modifications [illegible] the [illegible].

This report includes a discussion of the three issues in the AI/CESS
report and will be based, in part, upon the responses received from it.
In addition, pertinent literature on language development, assessment, the
relationship of language acquisition to cognition, and additional data analyses
completed by NCES provide the bases for this paper.

Responses to the AI/CESS report by those who were invited to respond are available
for examination at the Office of Research and Analysis, NCES. These responses
will be retained on file for a period of one calendar year following the publication
of this report.


STATEMENT OF THE ISSUES

The following analytical issues are the subject of this position paper:

1. Were the items which were selected for inclusion in the Language
   Measurement and Assessment Inventory (LM&AI) selected properly?

2. Were the cutoff scores for the LM&AI, which were determined and
   used to classify children as either English proficient or of
   limited English proficiency (LEP), set properly?

3. What were the effects of nonresponse bias on the counts and
   estimates of the number of LEP children?



DISCUSSION OF THE ISSUES AND RECOMMENDATIONS

ISSUE: Were the items which were selected
for inclusion in the Language Measurement and
Assessment Inventory (LM&AI) selected properly?



This issue was restated as:

Is English language proficiency the dimension
on which the scores vary, or are there other dimensions
associated with variations in the scores?5



Two subissues were posed, namely: 

• Are the test scores related to language dominance?

• Are the test scores related to general language development?



Discussion Lourdes Miranda, President of L. Miranda and Associates, the prime
contractor for the CESS, responded to the AI/CESS report. In discussing the
rationale for the test items selected for the LM&AI, Miranda noted that "it was
essential for us to measure the ability [of language minority children] to
successfully deal with academic classroom tasks that are often as clearly reliant
on memory and cognitive abilities as on English language skills."6 Therefore,
"other dimensions [e.g., cognitive] are associated with score variation."7



The LM&AI was specifically designed to meet the definition of limited English
proficiency found in the Bilingual Education Act, that is, the 1965 Elementary
and Secondary Education Act (ESEA), Section 703(a)(1)(B), as amended. The 1978
Amendment of the Act expanded the language skill domains to include speaking,
reading, writing and understanding the English language. By virtue of their
limited English proficiency, Congress concluded, language minority children
were denied opportunity to [illegible] at
appropriate age and grade levels.

J. Michael O'Malley, the NIE Project Officer for CESS, responded to the first
issue:

Because functioning in the classroom often requires
conceptual skills as well as oral language and literacy,
the inclusion of cognitive items [illegible] the test [illegible]
was seen as an acceptable approach for increasing the
content and perhaps the predictive validity of the test.8

O'Malley also stated that:

A "pure" measure of English proficiency could not have
possessed the content validity required to identify
language minority children who had difficulty profiting
from instruction in English.9



Earlier in his response, O'Malley said that test scores are
predictive "of the ability to profit from English language instruction, which
determines eligibility for ESEA Title VII."10 He observed that, "School
decisions on eligibility for ESEA Title VII are often based on a child's general
level of functioning in the classroom rather than on English language proficiency
alone."11 And later, "The LM&AI used tested skills in English . . . to simulate
the decisions schools would make in determining that language minority students
could not profit from instruction in English."12



A review of recent literature in the areas of language assessment, linguistic
and intellectual development, and bilingual education programs revealed that
analytical questions in these areas have, for some time, presented serious
intellectual challenges to researchers and educators. The issues raised in
the AI/CESS report were presented within the framework of the CESS development
process and with the knowledge that there are many unanswered basic research
questions in the three areas mentioned above. NCES' purpose in this report
is to clarify current thought on this issue.

The first subissue raised in the AI/CESS report was stated as: "Are the scores
related to language dominance?"13 O'Malley takes the following position:

By exclusion in the [legislative] definition of
eligibility, language dominance has no role in policy
determination for ESEA Title VII eligibility. Thus,
the statements in the NIE report on the CESS that
language dominance was considered irrelevant is
understandable.14



NCES/ORA believes there is justification for excluding a language dominance
concept in the development of the LM&AI. Specifically, there does not appear to
be agreement among linguists as to an operational definition and, therefore, the
impact of "language dominance" on the ability of language minority children to
profit from instruction [illegible].



DeAvila and Duncan (1976) argue against using a "language dominance" concept when
discussing school achievement. They say, "how [does] the concept of [language]
dominance clarify the relationship between the child's linguistic development and
school achievement in such a way that we can do something about it?"15
They continue by saying, "Another way of asking [this] question is by asking
whether or not 'dominance' in and of itself determines whether what is learned
or what can be learned."16 Language dominance "does not address the real issue
that the child might have language problems in both languages, the
native language and English."17



Some experts have argued that a language dominance concept is meaningful only
when the use of a language is considered within a social or cultural context,
such as: home and family relationships, social interactions, an academic domain,
a business environment, or within a religious context. The degree of fluency or
level of language dominance is meaningful only when the purpose for which language
is being used is also stated. In this sense, several "dominance" levels might be
defined.

Regarding a child's possible difficulty with both languages, Dubois (1980) states:
"Whether it is appropriate to assess English language proficiency, ignoring the
child's proficiency in another language, remains a policy question to be addressed."18
More specifically, this is an empirical question.

A recent article by Cummins (1979) addresses this question. In the following
excerpt L1 refers to a child's first language and L2 refers to the second language.
Cummins says:



The lack of concern for the developmental interrelationships
between language and thought in the bilingual child is one
of the major reasons why evaluations and research have provided
so little data on the dynamics of the bilingual child's
interaction with his educational environment. A direct
determinant of the quality of this interaction is clearly
the level of L1 and L2 competence which the bilingual child
develops over the course of his school career. ... What level
of L2 competence must the child possess at various grade
levels in order to benefit optimally from instruction in that
language? ... To what extent are L1 and L2 skills interdependent
and what are the implications of possible interdependencies
for cognitive and academic progress? In other words, do children
who maintain and develop their L1 in school develop higher or
lower levels of L2 skills than those whose L1 is replaced by
their L2?19



Cummins provides research evidence for a "developmental interdependence hypothesis"
which says that the level of L2 competence which a bilingual child attains is
partially a function of the type of competence the child has developed in L1 at
the time when intensive exposure to L2 begins. In this sense, a measure of L1
proficiency is important for policy decisions.

W. E. Lambert (1975) suggested that children exhibit either "additive bilingualism"
or "subtractive bilingualism." A child's bilingualism is most likely to be "additive"
when L1 is prestigious or the "dominant" language and is, therefore, not in danger
of being replaced by L2. In this case, a bilingual child "adds" L2 skills without
a loss of L1 skills. "Subtractive bilingualism" refers to the form of bilingualism
children experience when their L1 is eventually replaced by L2. This is generally
true when the child's L1 is a nonprestigious or a minority language. Socioeconomic
status also seems to be a factor which is related to whether a child's bilingualism
is subtractive or additive. Children from upper or middle class socioeconomic strata,
when given instruction in L2, tend to experience "additive bilingualism" while children
from lower socioeconomic strata tend to experience "subtractive bilingualism." Troike
(1980) hypothesizes that, for children from lower socioeconomic groups, a child's
cognitive development can become disrupted when a child begins learning L2 between the
ages of 6-10. Socioeconomic status and socio-political status are, therefore, related
to language and cognitive development.

The second subissue stated in the AI/CESS report was: "Are the test scores related
to general language development?"20 The concern was for the inclusion of test items
on the LM&AI which included cognitive components. Miranda noted earlier that the
purpose of the LM&AI was to measure the ability of language minority children to
successfully deal with academic classroom language skills. In reply to this subissue,
Miranda stated that, "it is difficult to imagine how a test of 'pure' linguistic
competence could have been developed should we have been asked to do so."21

DeAvila, et al. (1979) observed that "much confusion abounds with respect to
both the meaning and the measurement of English language proficiency."22
Moreover, they noted that "the role of language and cognition in general is
itself not clearly agreed upon."23 For the purpose of this discussion, cognition
shall mean the act or process of perceiving or knowing.

Cazden (1972) addressed two controversial items of interest in Child Language
and Education. The first item concerns whether a person's thought is affected
by the particular language forms or speech patterns with which they are
familiar. The second item concerns the question of which develops first, the
nonverbal idea or the words to express it.24 Essentially, this poses the
central issue: Which develops first, language or cognition? Language experts,
educational psychologists and professionals in related fields apparently do not
agree upon the proposed answers to this question.

Cazden's first item is based upon the Whorfian (1956) hypothesis which says
that "language influences our perceptions of and responses to the world."25
This leads us to believe that no learning can take place until language
proficiency is attained; therefore, language determines cognition. Regarding
Cazden's second item, Jean Piaget indicates that it is a child's cognitive
development which is the primary factor in language acquisition and development,
with a later emphasis on a more balanced interaction between the two.
Piaget's position is that cognition develops as a result of experience.26
He believes that although language contributes to further development, it is
the use of language that is determined by development and not the converse.

Cummins (1979), in a summary of research evidence on the role of language and
cognitive development, was led to conclude:

that the level of competence bilingual children achieve
in their two languages acts as an intervening variable in
mediating the effects of their bilingual learning experiences
on cognition. Specifically, there may be threshold
levels of linguistic competence which bilingual children
must attain both in order to avoid cognitive deficits and
to allow the potentially beneficial aspects of becoming
bilingual to influence their cognitive growth.27

DeAvila, et al. (1979) stated that: "Edmonds (1976) has recently argued that
a full understanding of language acquisition will not emerge until the process
is viewed within a larger developmental framework."28 And, related to this,
"Tremaine (1975) has examined 'syntax as an instance of operational intelligence'
defined in the Piagetian sense. The results indicated that children at the
operational level performed significantly better in terms of syntax comprehension than
children classified as nonoperational."29 Later, DeAvila interprets Tremaine's
findings as follows: "What this means is that solutions which focus on English
language deficits will be of limited success as long as developmental factors are
not taken into account."30

Studies have focused on several of these complex relationships. One of these
studies (DeAvila, et al., 1979) examined the relationship between the degree of
bilingualism (relative linguistic proficiency in English and Spanish), level of
intellectual development (cognition), and performance on two tests of
cognitive-perceptual functioning or field dependence/independence.

DeAvila concluded that: "In terms of educational implications, the most
accurate and least value-laden interpretation of the findings would be to
conclude that there seems to be a positive interaction between relative
linguistic proficiency and cognitive/perceptual functioning."31



In summary, G. Richard Tucker (1979) of the Center for Applied Linguistics
makes the following comments, with which NCES/ORA agrees:

Nor, in my opinion, have we managed to devise appropriate
and valid instruments to assess language proficiency. What
does it mean to know and to be able to communicate effectively
and acceptably in a language? Does there exist some
necessary (measurable) threshold of target language proficiency
which must be attained before one is able to profit
from instruction in that language? Obviously a great deal
of additional interdisciplinary research is needed to examine
the effects of factors such as intellectual potential,
social status, physical or emotional development, age of
entry, presence of native speakers, community stereotypes,
teacher characteristics, classroom techniques, sequencing
of languages, and social setting on the desirability and
efficacy of bilingual education programs. I remain optimistic
that the proposed Center for Bilingual Research may
begin to move us in the right direction.32

Troike (1980) suggests that the effect of the density of a specific language
minority group upon language proficiency in L1 or L2 is an additional factor
to add to Tucker's list which deserves additional research attention.

NCES/ORA cannot determine the effect of the cognitive components in the LM&AI
on the test scores based upon the information we now have from discussions
with experts in language development and assessment, and a review of pertinent
literature. A post hoc study of the cognitive component could be completed
using a sample of subjects from the population which was used for the
calibration of the LM&AI. This would be at an additional cost to the Government.
However, the quality of the results of such a study would probably not warrant
the cost since tests of language proficiency are generally confounded with
language and other factors.

Recommendation NCES/ORA recommends that NIE state in the final CESS report 
the caveats found in our discussion of this issue. There are clearly limi- 
tations to the CESS results which are a function of the current state-of-the- 
art in the assessment of language proficiency. 







ISSUE: Were the cutoff (critical) scores for the LM&AI,
which were determined and used to classify children as
either English proficient or of limited English
proficiency, set properly?

Discussion The purpose of the LM&AI was to provide a mechanism for
categorizing a child as being either English proficient or limited English
proficient. Therefore, the critical score determined for each age-level test
of the LM&AI is essential for the determination of valid LEP counts. The
critical score was that score which best differentiated LEP children from fluent
English-speaking (FES) children who were clearly profiting from instruction in
English. As an example, if the critical score on each age-level test is lowered
by two items, the estimated count of LEP children decreases from 2.41 million to
2.13 million children, or a decrease of 280,000. Similarly, if the critical
score for each age-level test is raised by two items, the estimated count of LEP
children is increased from 2.41 million to 2.62 million children, or an increase
of 210,000. Thus, a score difference of four items has the effect of altering
the count by nearly one-half million.
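
The sensitivity arithmetic in the preceding paragraph can be checked with a few lines of Python. This is only an illustrative sketch: the three counts are the rounded figures quoted in the text, and the variable names are our own.

```python
# Estimated national LEP counts quoted in the text (rounded to the nearest 10,000).
BASELINE = 2_410_000          # critical scores as set in the CESS Draft Report
LOWERED_TWO_ITEMS = 2_130_000 # every critical score lowered by two items
RAISED_TWO_ITEMS = 2_620_000  # every critical score raised by two items

decrease = BASELINE - LOWERED_TWO_ITEMS
increase = RAISED_TWO_ITEMS - BASELINE
# Total swing across the four-item band (two items below to two items above).
swing = RAISED_TWO_ITEMS - LOWERED_TWO_ITEMS

print(f"decrease: {decrease:,}")  # 280,000
print(f"increase: {increase:,}")  # 210,000
print(f"swing:    {swing:,}")     # 490,000, i.e. nearly one-half million
```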

The NCES/ORA requested the raw data on student scores from Field Test III, which
were used to determine the critical scores for the LM&AI, from the prime contractor,
L. Miranda and Associates, Inc. Based upon an examination of these raw data and
a comparison of these findings with table A-4 of the NIE Draft Report on the CESS,
a discrepancy in the data of table A-4 was discovered. This discrepancy was called
to the attention of the prime contractor. Ms. Miranda replied that the procedure
used for determining the critical scores, based on a discriminant function
analysis, was a modification (Grand Mean - Constant = Cutoff) of the more conventional
approach and resulted in a more conservative estimate of the number of limited
English proficient children (see Miranda, 1980; p. 6 for further information).
However, the data in table A-4 did not reflect this conservative approach. To
remedy this situation, Ms. Miranda has submitted a revised table A-4 for inclusion
in the final NIE report on CESS. A copy of the table is in appendix C.

In developing the LM&AI, five techniques were proposed as alternatives for
determining the critical scores. The five techniques33 are summarized below:

(1) For each age-level test determine the score which (on Field
    Test II data) was one standard deviation below the mean score
    for the FES (Fluent English Speakers) group of that age

(2) Similarly, use that score which was one standard deviation
    above the mean score for the LESA (later revised to LEP)
    group of each age

(3) Use the highest Field Test III LESA (LEP) score made by any
    individual on each age group test

(4) Plot the scores of LESA and FES separately and select the score
    equivalent to the point of intersection of the two distributions

(5) Use discriminant function analysis (DFA) which considers subscores
    to determine a centroid, which can act as the critical
    point
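
To make the first three alternatives concrete, here is a minimal Python sketch. The score lists are invented for illustration and are not field-test data. We also assume the sample standard deviation, since the report does not say which form was used.

```python
from statistics import mean, stdev

def cutoff_one_sd_below_fes(fes_scores):
    """Technique (1): one standard deviation below the FES group mean."""
    return mean(fes_scores) - stdev(fes_scores)

def cutoff_one_sd_above_lep(lep_scores):
    """Technique (2): one standard deviation above the LESA/LEP group mean."""
    return mean(lep_scores) + stdev(lep_scores)

def cutoff_highest_lep(lep_scores):
    """Technique (3): the highest score made by any LESA/LEP child."""
    return max(lep_scores)

# Hypothetical scores for a single age-level test (illustration only).
fes = [48, 52, 55, 50, 45]
lep = [20, 25, 30, 35, 40]

print(round(cutoff_one_sd_below_fes(fes), 2))
print(round(cutoff_one_sd_above_lep(lep), 2))
print(cutoff_highest_lep(lep))
```

Even on toy data the three rules yield different cutoffs, which is why the choice among the alternatives matters for the resulting LEP counts.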



After examining the "accuracy" of the various alternatives, DFA was chosen 
as the method for determining the critical scores. 

While NCES/ORA fully endorses the use of DFA, there remains an issue regarding
its use which concerns us. This concerns the application of DFA without concern
for the differential "costs" of misclassification. DFA is a very powerful tool
in that it minimizes the total proportion of the sample which is misclassified.
However, if the resulting classification criteria (critical scores) consistently
misclassify one subgroup (e.g., LEP) at the expense of the other, a serious bias
may result. More explicitly, if there are actually $N_1$ LEP children and $N_2$ English
proficient (fluent) children among the $N_1 + N_2$ children in non-English language
background households, then the cutoff score will lead to an unbiased classification
procedure if and only if

$$N_2 \cdot \Pr(\text{Classified LEP} \mid \text{Actually fluent}) = N_1 \cdot \Pr(\text{Classified fluent} \mid \text{Actually LEP}).$$

That is, the expected number of fluent children misclassified as LEP must equal
the expected number of LEP children misclassified as fluent.
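
The balance condition can be illustrated numerically. The sketch below uses invented population sizes and error rates (not CESS figures) and a function name of our own. It returns the expected net error in the LEP count, which is zero exactly when the two kinds of misclassification offset each other.

```python
def net_lep_count_bias(n_lep, n_fluent, p_lep_as_fluent, p_fluent_as_lep):
    """Expected (classified LEP count) minus (actual LEP count).

    n_lep, n_fluent  -- actual numbers of LEP and fluent children
    p_lep_as_fluent  -- Pr(Classified fluent | Actually LEP)
    p_fluent_as_lep  -- Pr(Classified LEP | Actually fluent)
    """
    fluent_counted_as_lep = n_fluent * p_fluent_as_lep
    lep_counted_as_fluent = n_lep * p_lep_as_fluent
    return fluent_counted_as_lep - lep_counted_as_fluent

# Hypothetical case 1: fluent children are never misclassified, but 20 percent
# of LEP children are. The LEP count is then understated by about 200.
print(net_lep_count_bias(1000, 4000, 0.20, 0.00))

# Hypothetical case 2: the two error flows offset, so the count is unbiased
# even though individual children are still misclassified.
print(net_lep_count_bias(1000, 4000, 0.20, 0.05))
```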

In defense of the procedures used, since $N_1$ and $N_2$ were not known in advance,
minimizing the overall misclassification error makes reasonable sense. However,
as can be seen in table A, the actual discrimination procedure used was much
more likely to misclassify LEP children than fluent children. This explains
why the critical scores for DFA seemed low.34

Table A presents the estimated conditional probabilities of correct and
incorrect classifications by the LM&AI for the critical scores found in the
revised table A-4 (appendix C).



Table A: Estimated conditional probabilities of correct and incorrect
classifications by the LM&AI (see appendix C, revised table A-4)

Age     P11 (1)    P12 (2)    P21 (3)    P22 (4)

 5      0.811      0.000      0.189      1.000
 6      0.795      0.000      0.205      1.000
 7      0.806      0.000      0.194      1.000
 8      0.893      0.000      0.107      1.000
 9      0.813      0.000      0.188      1.000
10      0.833      0.000      0.167      1.000
11      0.682      0.000      0.318      1.000
12      0.864      0.132      0.136      0.868
13      0.800      0.000      0.200      1.000
14      0.879      0.204      0.121      0.795

(1) P11 = Pr(Classified LEP | Actually LEP).
(2) P12 = Pr(Classified LEP | Actually Fluent).
(3) P21 = Pr(Classified Fluent | Actually LEP).
(4) P22 = Pr(Classified Fluent | Actually Fluent).



The bias evident in table A led NCES/ORA to conclude that the critical
scores for each age-level test of the LM&AI should be revised in order to
remove the estimated bias, once we have computed estimates of $N_1$ and $N_2$.
The mechanism by which this can be done follows.

Let $P_{11}$, $P_{12}$, $P_{21}$, and $P_{22}$ be defined as they are found
in table A. Let $N_1$ and $N_2$ be the actual number of LEP
and fluent children, respectively. Finally, let $L$ and
$F$ be the expected number of LEP and fluent children,
respectively, as estimated by the LM&AI. Then,

$$L = N_1 P_{11} + N_2 P_{12}$$

$$F = N_1 P_{21} + N_2 P_{22}$$

Solving for $N_1$ and $N_2$ we get

$$N_1 = (L P_{22} - F P_{12}) / (P_{11} P_{22} - P_{12} P_{21})$$

$$N_2 = (F P_{11} - L P_{21}) / (P_{11} P_{22} - P_{12} P_{21})$$
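
As a check on the algebra, the following Python sketch solves the two-equation system L = N1·P11 + N2·P12 and F = N1·P21 + N2·P22 for N1 and N2. The function name and the numbers in the example are hypothetical, chosen only so that the recovered counts are easy to verify by hand.

```python
def solve_actual_counts(L, F, p11, p12, p21, p22):
    """Recover the actual LEP (N1) and fluent (N2) counts from the
    classified counts L and F and the four conditional probabilities."""
    det = p11 * p22 - p12 * p21
    if det == 0:
        raise ValueError("the classification probabilities do not determine N1 and N2")
    n1 = (L * p22 - F * p12) / det
    n2 = (F * p11 - L * p21) / det
    return n1, n2

# Hypothetical example: start from N1 = 1000 and N2 = 4000, build L and F,
# then confirm that the solver returns the starting values.
p11, p12, p21, p22 = 0.8, 0.1, 0.2, 0.9
L = 1000 * p11 + 4000 * p12   # children classified LEP
F = 1000 * p21 + 4000 * p22   # children classified fluent
n1, n2 = solve_actual_counts(L, F, p11, p12, p21, p22)
print(round(n1), round(n2))
```

Note that when P12 = 0 and P22 = 1, as in most rows of table A, the solution reduces to N1 = L / P11: every child classified LEP is truly LEP, and the count is scaled up for the LEP children who were missed.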



Of course, the values of $P_{11}$, $P_{12}$, $P_{21}$, and $P_{22}$ are functions of the actual
critical scores which are used for separating LEP from fluent children. This
means that an iterative procedure must be used to determine the unbiased
estimates of $N_1$ and $N_2$ based on critical scores associated with "balanced"
misclassification errors. To accomplish this, the estimated "misclassification
balance", defined by $|L \cdot P_{12} - F \cdot P_{21}|$, must be calculated for each possible
critical score. For each age group, the critical score is selected which
minimizes the estimated misclassification imbalance. Using the expected number
of LEP ($L$) and fluent ($F$) children and the revised probabilities (P's) once
the expected misclassification imbalance has been minimized, we can
approximate the "unbiased" values of $N_1$ and $N_2$ for each age group. The values of
$L$, $F$, $P_{11}$, $P_{12}$, $P_{21}$ and $P_{22}$ which were used to compute $N_1$ and $N_2$ are found
in appendix D.
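
The selection step of this iterative procedure might be sketched as follows. We assume the balance expression reads |L·P12 − F·P21|, which is our reading of the printed formula. The candidate critical scores and their associated values are invented for illustration; the actual inputs are in appendix D, which is not reproduced here.

```python
def imbalance(L, F, p12, p21):
    """Estimated misclassification imbalance, read here as |L*P12 - F*P21|."""
    return abs(L * p12 - F * p21)

def pick_critical_score(candidates):
    """Return the candidate whose estimated imbalance is smallest.

    Each candidate is a dict with keys: score, L, F, p12, p21.
    """
    return min(candidates, key=lambda c: imbalance(c["L"], c["F"], c["p12"], c["p21"]))

# Hypothetical candidates for one age group (not CESS data).
candidates = [
    {"score": 40.5, "L": 1000, "F": 4000, "p12": 0.00, "p21": 0.20},
    {"score": 44.5, "L": 1200, "F": 3800, "p12": 0.10, "p21": 0.05},
    {"score": 48.5, "L": 1500, "F": 3500, "p12": 0.30, "p21": 0.01},
]

best = pick_critical_score(candidates)
print(best["score"])
```

Because only whole-item critical scores are available, the minimum imbalance found this way will generally be small but not exactly zero.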

The results of these computations (shown in table B) clearly demonstrate the
consistent bias in the LM&AI classification procedure. The CESS/LM&AI LEP
counts underestimate the "true" values at every age, except for ages 12 and 14.



Table B: Effect on 1978 CESS LEP counts of removing the estimated bias

Age       1978 CESS LEP Count     "Unbiased" LEP Count

Total         2,408,875               2,621,332

 5              192,297                 249,734
 6              291,622                 306,970
 7              275,924                 320,774
 8              257,807                 277,422
 9              167,304                 189,277
10              294,156                 329,047
11              190,064                 266,706
12              251,680                 207,383
13              196,577                 227,732
14              291,444                 246,282



In actual practice, the LEP counts determined by the critical score will almost
always differ from the "unbiased" estimate, since all children with a given score
must fall on one side of the critical score or the other. Therefore, we must
accept some bias in our counts, but NCES/ORA has minimized the expected bias by using
the procedure just described. Table C contains the CESS Draft Report critical scores
(with the slight modification mentioned earlier), the revised critical scores, and
the resulting LEP count for each age level. Note that the national LEP figure of
2,631,075 (table C) compares to an "unbiased" estimate of 2,621,332 (table B).



Table C: Revised critical scores and resulting LEP counts

Age      CESS Draft Report      Revised critical     Revised LEP
         critical score         score                count

Total                                                 2,631,075

 5           18.5                  25.5                 254,657
 6           26.5                  29.5                 303,584
 7           39.5                  44.5                 318,470
 8           38.5                  40.5                 280,256
 9           43.5                  46.5                 188,187
10           49.5                  52.5                 330,979
11           41.5                  51.5                 271,435
12           46.5                  44.5                 208,426
13           48.5                  52.5                 229,986
14           52.5                  49.5                 245,045



By minimizing the estimated bias, a less conservative, yet more analytically
sound, LEP count results, with a change in the National CESS estimate from
2,408,875 to 2,631,075 LEP children. This change represents a National
increase of 9.22 percent in the number of LEP children estimated in the CESS
Draft Report.
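
The size of this change can be recomputed from the per-age 1978 CESS LEP counts in table B and the revised national total in table C. The sketch below is only an arithmetic check; the dictionary and variable names are our own.

```python
# 1978 CESS LEP counts by age, as listed in table B.
CESS_LEP_BY_AGE = {
    5: 192_297, 6: 291_622, 7: 275_924, 8: 257_807, 9: 167_304,
    10: 294_156, 11: 190_064, 12: 251_680, 13: 196_577, 14: 291_444,
}
REVISED_TOTAL = 2_631_075  # revised national LEP count from table C

draft_total = sum(CESS_LEP_BY_AGE.values())
pct_increase = 100 * (REVISED_TOTAL - draft_total) / draft_total

print(f"draft total:  {draft_total:,}")      # 2,408,875
print(f"increase:     {pct_increase:.2f}%")  # 9.22%
```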

Recommendation NCES/ORA recommends that the NIE final report on CESS
reflect this analysis and the revised LEP counts found in table C.



ISSUE 3: What were the effects of non-response bias on
the counts and estimate of the number of LEP children?

Discussion. The question to be addressed is whether nonrespondents are
similar to or different from respondents to the study. There is no evidence in
the NIE Draft Report of November 1979 to indicate that nonresponse bias was
empirically investigated.

Dr. Donald Rogers, Vice President for Operations with Resource Development
Institute (one of the subcontractors), completed and forwarded to the author
a brief paper in response to AI/CESS. In his paper, Dr. Rogers presents
"the results of a very, very simple analysis of the effects of nonresponse
during the CESS study."36 A copy of Dr. Rogers' paper is in appendix B.

Dr. Rogers stated in a letter that accompanied his paper that:

My assumptions [appendix B] generated a weighted LESA
[LEP] total that fell within the 95 percent confidence
interval for the total weighted U.S. LESA [LEP] count
reported by the CESS study. I do not believe that a study
of nonrespondents will greatly increase or decrease the
total weighted U.S. LESA [LEP] count.37

Recommendation. NCES/ORA concurs with Dr. Rogers' position that further
investigations of nonresponse bias associated with the 1978 CESS are 
not warranted. 







NOTES

1. Dubois, 1980; p. 1.

2. Ibid., pp. 2-3.

3. Sterling and Weinkam, 1979; p. 2.

4. Ibid., p. 12.

5. Troike (1980) suggests that a restatement of this issue should not detract
attention from the fact that there is a dearth of, and therefore a need for,
basic research on the question of which types of items are appropriate for
language assessment and measurement at each age level. For example, it is
necessary to examine the range of grammatical or semantic variations which
are tolerable for each test item at each age level. Only after examining
this question and others, says Troike, can we hope to be confident of
obtaining reliable and valid measures of language proficiency.

6. Miranda, 1980; p. 1.

7. Ibid.

8. O'Malley, 1980; p. 2.

9. Ibid.

10. Ibid., p. 1.

11. Ibid., p. 2.

12. Ibid.

13. Dubois, 1980; p. 6.

14. O'Malley, 1980; p. 3.

15. DeAvila and Duncan, 1976; p. 9.

16. Ibid.

17. Ibid.

18. Dubois, 1980; p. 7.

19. Cummins, 1979; p. 227.

20. Dubois, 1980; p. 6.






21. Miranda, 1980; p. 2.

22. DeAvila, Duncan, Ulibarri, and Fleming, June 1979; p. 8.

23. Ibid., p. 50.

24. Cazden, 1972; p. 226.

25. Ibid.

26. Cazden, 1972; pp. 230-232. DeAvila, Duncan, Ulibarri, and Fleming,
June 1979; p. 51.

27. Cummins, 1979; p. 229.

28. DeAvila, Duncan, Ulibarri, and Fleming, June 1979; p. 53.

29. Ibid.

30. Ibid.

31. Ibid., p. 38.

32. Tucker, 1979; p. 75.

33. Miranda, 1979; p. 38.

34. Ibid., p. 43.

35. The analyses presented here were developed by Dr. Rolf M. Wulfsberg,
Assistant Administrator for Research and Analysis, NCES.

36. Rogers, 1980; appendix B, p. 1.

37. Ibid., p. 1.






Bibliography

Cazden, Courtney B. Child Language and Education. New York:
Holt, Rinehart and Winston, Inc., 1972.

Cummins, James. "Linguistic Interdependence and the Educational
Development of Bilingual Children." Review of Educational Research,
vol. 49, no. 2, Spring 1979, pages 222-251.

DeAvila, Edward A. and Sharon E. Duncan. "A Few Thoughts About Language
Assessment: The Lau Decision Reconsidered." Larkspur, California:
DeAvila, Duncan & Assoc., June 17, 1976, 33 pages (photocopy).

DeAvila, Edward and Sharon Duncan. Relative Linguistic Proficiency
and Field Dependence/Independence: Some Findings on Linguistic
Heterogeneity and Cognitive Style of Bilingual Children. Austin,
Texas: Southwest Educational Development Laboratory, February 1979.

DeAvila, Edward A., Sharon E. Duncan, Daniel M. Ulibarri, James S. Fleming.
Predicting the Success of Language Minority Students from Developmental,
Cognitive Style, Linguistic, and Teacher Perception Measures. Austin,
Texas: Southwest Educational Development Laboratory, June 1979.

Dubois, David D. "Analytical Issues Regarding the Children's English and
Services Study." Washington, D.C.: U.S. Department of Education,
National Center for Education Statistics, January 21, 1980, 16 pages
(photocopy).

Edmonds, Marilyn H. "New Directions in Theories of Language Acquisition."
Harvard Educational Review, vol. 46, no. 2, May 1976, page 175.

Lambert, W. E. "Culture and Language as Factors in Learning and Education."
Education of Immigrant Students (A. Wolfgang, Ed.) Toronto:
Ontario Institute for Studies in Education, 1975.

Martinez, Jose. "Response to Issues on LM&AI." Response to AI/CESS with
cover letter dated February 22, 1980 at Sacramento, California, 5 pages.

Miranda, Lourdes. CHILDREN'S ENGLISH AND SERVICES STUDY: Technical Report
on the LM&AI. Bethesda, Maryland: L. Miranda and Associates, Inc.,
September 10, 1979, 44 pages.

Miranda, Lourdes and Associates. "Response to the Office of Research and
Analysis of the National Center for Education Statistics' Inquiry on
Three Analytical Issues Associated with the 1978 Children's English
and Services Study." Bethesda, Maryland: L. Miranda and Associates,
Inc., February 15, 1980, 10 pages.






O'Malley, J. Michael. Language Minority Children with Limited English
Proficiency in the United States, Spring, 1978 (Draft Report).
Washington, D.C.: National Institute of Education, 1979.

O'Malley, J. Michael. Response to AI/CESS in the form of a letter dated
January 31, 1980 at Alexandria, Virginia, 4 pages.

Peng, Samuel S. Response to AI/CESS in the form of a letter dated February
12, 1980 at Rockville, Maryland, 5 pages.

Rogers, Donald D. Response to AI/CESS in the form of a letter and attachment
dated February 3, 1980 at Austin, Texas, 5 pages.

Rowlett, Karen. Response to AI/CESS in the form of a letter dated February 3,
1980 at Austin, Texas, 3 pages.

Silverman, Leslie J. Response to AI/CESS in the form of an (HEW) Education
Division memorandum dated February 7, 1980 at Washington, D.C.

Sterling, T. D., J. J. Weinkam. "What Happens When Major Errors are
Discovered Long After An Important Report Has Been Published?"
Paper presented to the American Statistical Association Annual
Meeting in Washington, D.C. on August 16, 1979. Burnaby, British
Columbia: Simon Fraser University, 1979, 13 pages (photocopy).

Tremaine, R. V. Syntax and Piagetian Operational Thought. Washington, D.C.:
Georgetown University Press, 1975.

Troike, Rudolph C. Personal communications with Dr. David D. Dubois, May 30,
1980.

Truex, Kathy. Response to AI/CESS in the form of an (HEW) Office of the
Secretary memorandum dated March 18, 1980 at Washington, D.C.

Tucker, G. Richard. "Bilingual Education: Some Perplexing Observations."
Educational Evaluation and Policy Analysis, vol. 1, no. 5, September-
October, 1979, pp. 74-75.

Whorf, B. L. Language, Thought and Reality: Selected Writings of Benjamin
Lee Whorf. (J. B. Carroll, Ed.) Cambridge: M.I.T. Press, 1956.



APPENDIXES 






A: Analytical Issues Regarding the
Children's English and Services Study






ANALYTICAL ISSUES REGARDING THE
CHILDREN'S ENGLISH AND SERVICES STUDY

Prepared by 
Dr. David D. Dubois, Policy Analyst 
Office of Research and Analysis 
National Center for Education Statistics 
(January 21, 1980) 

Introduction 

The 1978 Children's English and Services Study (CESS) was recently completed
under contract from the National Institute of Education (NIE), with shared
support from the National Center for Education Statistics (NCES) and the
U.S. Office of Education (USOE). The final project report will be published
by NIE. The principal objective of the CESS was to objectively determine an
estimate of the number of limited English proficient (LEP) children between
the ages of 5 and 14 in the United States.

The purpose of this paper is to present and discuss three analytical issues 
which have been identified as a result of an assessment of the research design, 
data analyses and other information which are described in the 1978 CESS draft 
report of September 6, 1979 (and a later revision dated November, 1979) enti- 
tled "Language Minority Children With Limited English Proficiency in the 
United States: Spring 1978."

This inquiry is sponsored by the Office of Research and Analysis (ORA) of the 
National Center for Education Statistics (NCES). To date, reviewers have 
included the NCES Assistant Administrator for Research and Analysis, the ORA 
Policy Analyst, and an external consultant from the American Institute for 
Research in the Behavioral Sciences (AIR) whose services were obtained under
contract with the NCES/AIR Statistical Analysis Group in Education. This 
paper is based entirely upon these reviews. 




Recipients of this paper are invited to respond to the analytical issues. 
Based upon their responses and other information, the Office of Research and 
Analysis will publish a position paper on the resolution of the identified 
analytical issues. 

Objective of the Inquiry

From an analytical point of view, the 1978 CESS could become a landmark in 
the determination of estimates of the number of LEP children in the United
States. The CESS estimate of the number of LEP children was accomplished
directly by developing and administering a domain-referenced content test to 
a sample of children from language minority households in order to assess 
English language skills in speaking, reading, writing, and understanding. 
Prior to 1978, estimates of this type were derived by using surrogate or 
indirect measures. 

It is anticipated that the results of the 1978 CESS will be extensively used
and frequently cited by U.S. Government officials, members of the U.S. Con- 
gress, and others. At NCES, for example, it is anticipated that the CESS 
data base will be used, with other surrogate measures, to calibrate the 1980 
U.S. Census data in order to determine recent and accurate LEP person counts. 
Additionally, the CESS data base will be a component data base of the NCES 
study to determine projections of the numbers of LEP persons in the U.S. for 
the next 5, 10, 15, and 20 years. 

Since the results of the 1978 CESS are of tremendous importance to present 
and future research studies, bilingual program and policy development, and 
funding for bilingual education, unresolved analytical issues which could 
adversely affect the validity of the results are being stated with the hope
of their resolution. As mentioned earlier, the ORA will publish a position
paper as a result of this inquiry. In the position paper, we expect to 
provide a technical reply to each issue. Our reply is expected to include 
recommendations or suggestions for additional research tasks and/or caveats
to current CESS reports which could, in our opinion, improve the quality of
the products we now have. 

Invitation to Respond 

Recipients of this paper are encouraged to respond to the issues. Respondents 
are assured that their contributions will be carefully considered prior to the 
development and issuance of the ORA position paper. The position paper will 
be released only after each recipient (or his or her designate) has responded 
or has indicated that he or she will not respond to the issues. 

Written replies must be received no later than the close of business, Friday,
February 8, 1980. Replies to the issues must be written and should be addressed 
to: 

Dr. David D. Dubois, Policy Analyst 
National Center for Education Statistics 
400 Maryland Avenue, SW, Room 3153 
Washington, DC 20202 
(Telephone: 202-245-8233) 

The persons listed below were designated to receive a copy of this paper. 

Name                      Agency

Edward Bryant             Westat, Inc.
Lois-Ellin Datta          National Institute of Education
Karen Dietz/Don Rogers    Univ. of Texas-Austin/Resource Development Institute
Josue M. Gonzales         Office of Bilingual Education, OE
Ron Mall                  Office of the Assistant Secretary for Education
                          (Policy Development)
Ty Hartwell               Research Triangle Institute
Reynaldo Macias           National Institute of Education
Jose Martinez             California State Department of Education
Lourdes Miranda-King      L. Miranda and Associates, Inc.
J. Michael O'Malley       National Institute of Education
Samuel Peng               Westat, Inc.
Leslie Silverman          National Center for Education Statistics
Kathy Truex               Office of the Assistant Secretary for Planning
                          and Evaluation
James Vanecko             Office of the Assistant Secretary for Education
                          (Policy Development)
Carl Wisler               Office of Evaluation and Dissemination, OE



History of the 1978 CESS 

The 1978 CESS was developed by NIE through a contract with L. Miranda and 
Associates, Inc. Lourdes Miranda-King was the project director. Dr. J. 
Michael O'Malley was the NIE project officer and Leslie J. Silverman was the 
NCES coordinator. Subcontractors included Westat, Resource Development
Institute and Research Triangle Institute.

The primary mission of the 1978 CESS was to objectively determine an estimate 
of the number of LEP children, ages 5-14, inclusive, in the United States. A 
nationally representative sample of households was surveyed during the Spring 
of 1978. Households were identified where a language other than English was 
spoken and where children between the ages of 5 and 14 were living. The
Language Measurement and Assessment Inventory (LM&AI), a test in English that
determines whether or not a child is limited in English language proficiency,
was developed. Selected children from the identified households were
individually administered the LM&AI. Specifications for the survey design
and the LM&AI were provided by an advisory group composed of State Education
Agency representatives in bilingual education, assessment, and data collection.

The LM&AI was designed to measure skills in speaking, understanding, reading 
and writing in English. The test is domain-referenced for objectives that 
children at ages 5-14 would be expected to perform in order to profit from 
instruction in an all-English language educational environment.

Ten separate tests, one for each age, were developed and used in the survey.
Reliabilities of the test for the separate forms range from .86 to .92. As
a result of preliminary field tests of the LM&AI, a critical score for each 
age test was determined which could be used to classify each child as pro- 
ficient in English or as limited English proficient. 

The contractor provided three cautionary caveats regarding the LM&AI. First, 
the LM&AI was not designed to determine placement or diagnosis with indivi- 
dual children in educational settings. Second, the instrument was designed 
in a manner that resulted in an unknown level of cultural bias. Third, the 
LM&AI items are not "pure" measures of English language proficiency; some of 
the items assess English language proficiency, memory and cognitive ability.

We understand that the final NIE report of the 1978 CESS is scheduled for 
publication in the immediate future. 

Statement and Discussion of the Issues 

Three analytical issues are presented and discussed. They are: 

1. Were the items which were selected for inclusion in the Language
Measurement and Assessment Inventory (LM&AI) selected properly?

2. Were the cutoff scores for the LM&AI, which were determined and
used to classify children as either English proficient or of limited
English proficiency, set properly?

3. What were the effects of non-response bias on the counts and estimates
of the number of LEP children?

If the first question is answered negatively, then the value of the entire
1978 CESS is brought into question. In the event that it is answered
affirmatively, then a negative answer to the second question would imply the need
for further analyses of the CESS data, and possibly the collection of
additional data, in order to re-compute the cutoff scores. The issue raised by
the third question could be empirically investigated in the event that it was
decided to collect the additional data described earlier.

A detailed discussion of each issue follows.

ISSUE 1: Were the items which were selected for inclusion
in the Language Measurement and Assessment Inventory
(LM&AI) selected properly?

Discussion. Each age-level instrument of the LM&AI consisted of a
set of items that could be scored so that a high score would indicate that the
child was proficient in English while a low score would indicate that the child
was limited English proficient. Therefore, the issue can be rephrased in the
following manner: Is English language proficiency the dimension on which the
scores vary, or are other dimensions associated with variation in the scores?
More specifically:

• Are the test scores related to language dominance?

• Are the test scores related to general language development?




The question of language dominance is addressed in the project Draft Report
(November, 1979):

English should be the exclusive criterion irrespective
of the child's proficiency in the non-English language.
Thus, language dominance was considered irrelevant to
the discussion. (Page II-3)

This objective of the study is subject to question on the basis that, for
bilingual education policy development, a child's dominant language might
affect the potential benefits from participation in a bilingual education
program. The reader is cautioned that this review does not attempt to
operationally define the phrase "bilingual education program" and that this omission
was intentional. Whether it is appropriate to assess English language
proficiency, ignoring the child's proficiency in another language, remains a policy
question to be addressed.

Are the scores on the test related to general language development? The
project Draft Report (November, 1979) states that:

... items on the test are not "pure" measures of
English language proficiency. In some cases, the items
assess English language proficiency, memory, and
cognitive ability. The intermingling of the potentially
disparate constructs was intentional to give the items
as much validity for representing important school tasks
as possible. (Page A-10)

Any test so developed could also differentiate between two children with 
equal English language proficiency, giving a higher score to the child with 
greater memory and/or cognitive abilities. It could be argued, therefore, that 
the test development procedures should have excluded items not primarily 
associated with English language proficiency. The types of items selected for 
the test (Draft Report; November, 1979; Table A-1) appear to be generally 
assessing relevant content. There is, however, a component of general cogni- 
tive development, not merely English language development. 






The choice of items for the LM&AI was a function of a field test. Items
were selected that best differentiated between two criterion groups. The
project Draft Report (November, 1979) states:

The test was being developed to differentiate language
minorities who were limited in English proficiency from
those who could profit from instruction in English.
Items under development were to be field tested with
two clearly defined criterion groups: (a) limited
English proficient children; and (b) fluent English speaking
children who were clearly profiting from instruction in
English. (Page II-6)

The test was clearly being prepared for administration to language minority
children. The dimension being tested is essentially the dimension on which
those two groups differ most. It could be argued that the two groups differed
on native language as well as English language proficiency and, therefore, the
test scores could be expected to have a partial language dominance loading. A
potential solution to this problem would be to equate the two criterion groups
on proficiency in a non-English language. This would make the test independent
of language dominance.

ISSUE 2: Were the cutoff scores for the LM&AI, which were
determined and used to classify children as either English
proficient or of limited English proficiency, set properly?

Discussion. The purpose of the LM&AI was to provide a mechanism for
making a dichotomous assignment of a child as being either English proficient or
limited English proficient. Therefore, the cutoff score which was chosen for
each age-level test of the LM&AI is critical for the determination of valid
counts. As an example, if the cutoff score on each age-level test is lowered
by two items, the estimated count of LEP children decreases from 2.41 million
to 2.13 million children, or a decrease of 280,000. Similarly, if the cutoff
score for each age-level test is raised by two items, the estimated count of LEP
children is increased from 2.41 million to 2.62 million children, or an increase
of 210,000. Thus, a score difference of four items has the effect of altering
the count by nearly one-half million.

Recall that the cutoff score was that score which best differentiated LEP
children from fluent English-speaking (FES) children who were clearly profiting
from instruction in English.

In developing the LM&AI, five techniques were proposed as alternatives for
determining the cutoff scores. The five techniques are summarized on page 38
of the CHILDREN'S ENGLISH AND SERVICES STUDY: Technical Report on the LM&AI 
(L, Miranda and Associates, Inc., September 10, 1979): 

(1) For each age-level test determine the score which (on Field
Test II data) was one standard deviation below the mean score
for the FES group of that age.

(2) Similarly, use that score which was one standard deviation
above the mean score for the LESA (later revised to LEP) group
of each age.

(3) Use the highest Field Test III LESA score made by any 
individual on each age group test. 

(4) Plot the scores of LESA and FES separately and select the 
score equivalent to the point of intersection of the two 
distributions, 

(5) Use discriminant function analysis (DFA) which considers 
subscores to determine a centroid point, which can act as the 
critical point. 
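Techniques (1) and (2) are simple enough to illustrate. The sketch below is not taken from the Technical Report: the score lists are hypothetical, the function names are invented for illustration, and the use of the sample (n - 1) standard deviation is an assumption, since the report does not say which form was used.

```python
from statistics import mean, stdev


def cutoff_below_fes_mean(fes_scores):
    """Technique (1): one standard deviation below the FES group mean."""
    return mean(fes_scores) - stdev(fes_scores)


def cutoff_above_lesa_mean(lesa_scores):
    """Technique (2): one standard deviation above the LESA group mean."""
    return mean(lesa_scores) + stdev(lesa_scores)


# Hypothetical field-test scores for one age level (not CESS data).
fes_scores = [48, 52, 55, 50, 45]
lesa_scores = [20, 25, 30, 15, 35]

fes_cutoff = cutoff_below_fes_mean(fes_scores)
lesa_cutoff = cutoff_above_lesa_mean(lesa_scores)
print(f"technique (1): {fes_cutoff:.2f}")
print(f"technique (2): {lesa_cutoff:.2f}")
```

With data like these the two techniques give quite different cutoffs, which is one reason the choice among the five alternatives matters for the resulting counts.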

After examining the "accuracy" of the various alternatives, DFA was chosen as
the method for determining the cutoff scores. While this Office fully endorses
this choice, there remain three subissues which still bother us.*

First, the above excerpt from the Technical Report implies that subscores were
used in the DFA. If this is so, several events must have happened.

*The analyses found here were developed by Dr. Rolf M. Wulfsberg, the Assistant
Administrator for Research and Analysis at NCES.




(1) The subscores would be transformed into a new total score
representing a linear combination of the subscores. This
new score would be real-valued (as opposed to integer-valued)
and it would be conceivable, in fact highly
likely, that relative scores between two individuals
could be reversed. That is, if individual A had a higher
original score than individual B, the revised DFA score
for A could easily be lower than that of B due to
differential weighting of the subscores. Since no scores on the
final CESS tape are non-integer-valued, and since no
reversal of the kind discussed above occurred, one can only
assume that subscores were, in fact, not used in the DFA.

(2) The relative weighting of the items, which was carefully 
designed, would be totally revised by the differential 
weighting of the DFA procedure. This is another reason 
that this Office doubts that subscores were used. 

The second subissue concerns the application of DFA without concern for the
differential "costs" of misclassification. DFA is a very powerful tool in that
it minimizes the total proportion of the sample which is misclassified. However,
if the resulting classification criteria (cut scores) consistently misclassify
one subgroup (e.g., LEP) at the expense of the other, a serious bias may result.
More explicitly, if there are actually $N_1$ LEP children and $N_2$ English proficient
(fluent) children among the $N = N_1 + N_2$ children of non-English language
background households, then the cutoff score will lead to an unbiased classification
procedure if and only if $N_2 \cdot \Pr(\text{Classified LEP} \mid \text{Actually fluent}) = N_1 \cdot \Pr(\text{Classified fluent} \mid \text{Actually LEP})$.
That is, the expected number of fluent children misclassified
as LEP must equal the expected number of LEP children misclassified
as fluent.
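The balance condition can be checked numerically. The sketch below uses the age 5 conditional probabilities from Table A and the age 5 "unbiased" LEP count from Table B; the number of fluent children is a hypothetical placeholder, because it does not appear in this paper, and the function names are illustrative only.

```python
def expected_misclassifications(n_lep, n_fluent, p12, p21):
    """Return (fluent children classified LEP, LEP children classified fluent).

    p12 = Pr(classified LEP | actually fluent)
    p21 = Pr(classified fluent | actually LEP)
    """
    return n_fluent * p12, n_lep * p21


def is_unbiased(n_lep, n_fluent, p12, p21, tolerance=0.5):
    """True when the two expected misclassification counts balance."""
    fluent_as_lep, lep_as_fluent = expected_misclassifications(
        n_lep, n_fluent, p12, p21
    )
    return abs(fluent_as_lep - lep_as_fluent) <= tolerance


# Age 5: P12 = 0.000 and P21 = 0.108 (Table A); N1 = 215,580 (Table B).
n_lep = 215_580
n_fluent = 1_000_000  # hypothetical placeholder, not a CESS figure
fluent_as_lep, lep_as_fluent = expected_misclassifications(
    n_lep, n_fluent, 0.000, 0.108
)
print(round(fluent_as_lep), round(lep_as_fluent))
print(is_unbiased(n_lep, n_fluent, 0.000, 0.108))
```

Because P12 is zero at age 5, no fluent children are expected to be counted as LEP, while roughly 23,000 LEP children are expected to be counted as fluent, so the condition fails whatever value is assumed for the number of fluent children.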

In defense of the procedures used, since $N_1$ and $N_2$ were not known a priori,
minimizing the overall misclassification error makes reasonable sense. However,
as can be seen in Table A, the actual discrimination procedure used was much
more likely to misclassify LEP children than fluent children. This explains why
the cut scores for DFA seemed low (see page 43 of the aforementioned Technical
Report).




Table A: Estimated Conditional Probabilities of Correct
and Incorrect Classifications by the LM&AI

Age     P11 1/     P12 2/     P21 3/     P22 4/

5       0.892      0.000      0.108      1.000
6       0.955      0.037      0.045      0.963
7       0.889      0.000      0.111      1.000
8       0.929      0.000      0.071      1.000
9       0.906      0.000      0.094      1.000
10      0.944      0.000      0.056      1.000
11      0.795      0.000      0.205      1.000
12      0.864      0.182      0.136      0.818
13      0.880      0.000      0.120      1.000
14      0.879      0.204      0.121      0.796

1/ P11 = Pr(Classified LEP | Actually LEP).
2/ P12 = Pr(Classified LEP | Actually Fluent).
3/ P21 = Pr(Classified Fluent | Actually LEP).
4/ P22 = Pr(Classified Fluent | Actually Fluent).






The evident bias described above raises the third subissue: Should the cutoff
scores be revised to remove the estimated bias after the fact (when we have
estimates of $N_1$ and $N_2$)? This Office tends to feel that this should be done.
The mechanism by which this could be done is described below.

Let $P_{11}$, $P_{12}$, $P_{21}$, and $P_{22}$ be defined as in Table A. Let $N_1$ and $N_2$ be the
actual number of LEP and fluent children, respectively. Finally, let $L$ and $F$
be the expected number of LEP and fluent children, respectively, estimated by
the LM&AI. Then,

$$L = N_1 P_{11} + N_2 P_{12} \qquad (1)$$
$$F = N_1 P_{21} + N_2 P_{22} \qquad (2)$$

Solving for $N_1$ and $N_2$, we get

$$N_1 = \frac{L P_{22} - F P_{12}}{P_{11} P_{22} - P_{12} P_{21}}$$ and
$$N_2 = \frac{F P_{11} - L P_{21}}{P_{11} P_{22} - P_{12} P_{21}}.$$

By using the actual CESS estimates for $L$ and $F$, we can then approximate the
unbiased values of $N_1$ and $N_2$ for each age group. The results, which are shown
in Table B, clearly demonstrate the consistent bias in the LM&AI classification
procedure. The CESS/LM&AI LEP counts underestimate the "true" values at every
age, except for age 14.
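The solution of equations (1) and (2) can be sketched in a few lines. The function below is illustrative rather than the procedure actually run by this Office. The age 5 probabilities come from Table A and the age 5 CESS LEP count from Table B; the CESS count of fluent children is not reported in this paper, so the value used for it is a hypothetical placeholder. Because P12 is zero at age 5, the estimate of the number of LEP children does not depend on that placeholder.

```python
def unbiased_counts(lep_count, fluent_count, p11, p12, p21, p22):
    """Solve equations (1) and (2) for the actual numbers (N1, N2).

    lep_count and fluent_count are the classified counts L and F.
    """
    determinant = p11 * p22 - p12 * p21
    if determinant == 0:
        raise ValueError("classification probabilities are degenerate")
    n1 = (lep_count * p22 - fluent_count * p12) / determinant
    n2 = (fluent_count * p11 - lep_count * p21) / determinant
    return n1, n2


# Age 5: probabilities from Table A, CESS LEP count from Table B.
p11, p12, p21, p22 = 0.892, 0.000, 0.108, 1.000
cess_lep = 192_297
cess_fluent = 1_000_000  # hypothetical placeholder, not a CESS figure

n1, n2 = unbiased_counts(cess_lep, cess_fluent, p11, p12, p21, p22)
print(round(n1))  # prints 215580, the age 5 "unbiased" count in Table B
```

Substituting the solved values back into equations (1) and (2) reproduces the two classified counts, which is a quick check that the closed-form solution has been entered correctly.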




Table B: Effect on LEP Counts of Removing Estimated Bias

AGE       CESS LEP COUNT      "UNBIASED" LEP COUNT

5         192,297             215,580
6         291,622             301,767
7         275,924             310,375
8         257,807             277,510
9         167,304             184,662
10        294,156             311,506
11        190,064             239,074
12        251,680             262,412
13        196,577             223,383
14        291,444             284,766
Total     2,408,875           2,611,135






If we accept the new LEP counts as more realistic estimates of the true values, 
then we can adjust the cut scores to reflect these new counts by raising (except 
for age 14) the cut scores until the proper number of children have been classi- 
fied as LEP. In reality, this point will (almost) always fall in the middle of a 
cell (score), so one can choose the cut score which will yield the closest
estimate to $N_1$.
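The selection rule just described (pick the cut score whose LEP count is closest to the target) can be sketched as follows. The counts by cut score below are hypothetical, since the score distributions are not reproduced in this paper; only the target value is taken from Table B (age 5), and the function name is invented for illustration.

```python
def closest_cut_score(lep_count_by_cut, target_count):
    """Return the cut score whose resulting LEP count is nearest the target."""
    return min(
        lep_count_by_cut,
        key=lambda cut: abs(lep_count_by_cut[cut] - target_count),
    )


# Hypothetical LEP counts produced by each candidate cut score.
lep_count_by_cut = {
    18.5: 190_000,
    19.5: 205_000,
    20.5: 214_000,
    21.5: 224_000,
}
target = 215_580  # age 5 "unbiased" LEP count from Table B

best_cut = closest_cut_score(lep_count_by_cut, target)
print(best_cut)  # prints 20.5 for these hypothetical counts
```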

In the case which is presented in Table C, a different rule was used. Since there
is an abnormal "roller coaster" effect to the data to begin with in the relation- 
ship between age and percent LEP, the cut score leading to the percentage closest 
to the overall mean percentage was chosen for each age group. That is, the lower 
cut score was generally used for even ages and the higher cut score was generally 
used for odd ages. 


Table C: Modified Cut Scores and Resulting LEP Counts

Age       Old cut score     New cut score     New LEP count

5         18.5              21.5              223,327
6         26.5              28.5              298,929
7         39.5              43.5              307,759
8         38.5              39.5              268,830
9         43.5              46.5              188,187
10        49.5              50.5              310,860
11        41.5              45.5              246,921
12        46.5              46.5              251,680
13        48.5              51.5              223,785
14        52.5              52.5              291,444
Total                                         2,611,722



The relationships among the CESS/LM&AI estimates, the unbiased estimates, and 
the adjusted CESS estimates are evident in Chart 1. Chart 1 shows the percent 
of each age cohort for each of the estimates of the number of limited English 
proficient children. 



Chart 1: Percent of Each Age Cohort Estimated to be LEP

The procedure described above should go a long way toward removing the bias in
the LM&AI. Of course, the values of $P_{ij}$ used in the derivation are conditioned
on the original cut scores used by the LM&AI. With the modified cut scores, the
$P_{ij}$'s would change (as do the new L and F counts shown in Table C), so that the
results could still change slightly. (This is because the LM&AI sample of 35:.
fluent children and 337 LEP children is not necessarily representative of their
respective populations.) The Office of Research and Analysis is attempting to
obtain the original data which were used to determine the cutoff scores on the
LM&AI from L. Miranda and Associates, Inc. in order to explore this issue.

ISSUE 3: What were the effects of non-response bias on the
counts and estimate of the number of LEP children?

Discussion. In survey research of this type, the potential effects
of non-response bias are a reality. The question to be addressed is whether
non-respondents are similar to or different from respondents to the study.

Response rates by regional subpopulations (New York, Texas, California, 
remainder of the U.S.) for the household screener, questionnaire and for the 
administration of the LM&AI are presented in Table III-l of the Draft Report 
(November, 1979). From the table it can be determined that the response rates, 
totaled over all subpopulations, were: household screener, 76.2%; household 
questionnaire, 93.8%; LM&AI administration, 84.6%. Response rates were derived 
by using the formula 

$$\text{Response Rate} = \frac{\text{Total Number Completed}}{\text{Total Number Eligible}} \times 100$$
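As a small illustration of the formula, the sketch below computes a response rate from a pair of counts. The counts are hypothetical and were chosen only so that the result matches the 76.2% household screener rate quoted above; the underlying totals from Table III-1 are not reproduced in this paper.

```python
def response_rate(total_completed: int, total_eligible: int) -> float:
    """Return completed cases as a percentage of eligible cases."""
    if total_eligible <= 0:
        raise ValueError("total_eligible must be positive")
    return total_completed * 100.0 / total_eligible


# Hypothetical counts, not the actual CESS totals.
screener_rate = response_rate(762, 1000)
print(f"{screener_rate:.1f}%")  # prints "76.2%"
```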

There is no evidence in the Draft Report (November, 1979) to indicate that
non-response bias was empirically investigated. Although adjusting weights by
poststratification is a customary practice, it can be argued that this is not
necessarily a satisfactory substitute for empirically investigating differences
between respondents and non-respondents.

In the event that the first issue stated herein is answered in the affirmative
and, additionally, a decision is made to collect additional data for
recalibration of the LM&AI, an empirical investigation of non-response bias can be
undertaken concurrently.

In summary, ORA reviewers believe that these issues can be resolved and,
accordingly, that the study can be retained by cooperative responsible action.






B: Rogers' Paper, Nonresponse Analysis






AUTHOR: Dr. Donald Rogers

Resource Development Institute
Austin, Texas

ATTACHMENT A: Nonresponse Analysis



Purpose 

The purpose of this paper is to present the results of a very, very 
simple analysis of the effects of nonresponse during the CESS study. 

General Procedure 

The general procedure was to assume that nonresponding "SCR incomplete; 
probable ineligible households (Code 8 households)" had characteristics 
that were significantly different from responding households. The impact 
of this assumption was then determined by reweighting the data and 
recomputing NELB and LESA counts. 



Limitations 

The analysis reported here uses average weights. Ideally, each stratum 
would be considered individually. However, the resources required for a 
stratum-by-stratum analysis were not available, so average weights were used 
because they were easy to compute. This means that the results of this 
analysis only suggest the type of results that would be obtained 
by a more sophisticated analysis. 

References 

This paper is based on the information presented in RDI's final CESS 
reports. Data have been taken from Section 8 (Data Analysis Procedures) 
and Section 9 (Results) of Volume I. Weighting formulae are taken from 
Appendix 6.6 of Volume II. The reader must have these reports to follow 
this paper. For example, the definitions of variables are presented in 
Appendix 6.6 and are not repeated here. 



Assumptions 

The following assumptions were made to assess the effects of nonresponse: 

1. All Category 8 households complete the SCR. 

2. The percentage of Category 8 households that are eligible 
and complete the HHQ is twice as great as the percentage 
of Category 1, 2, and 3 households. 

3. All of the eligible Category 8 households complete the 
HHQ. 

4. The average number sampled per eligible household is the 
same for Category 8 households as for Category 1, 2, and 3 
households. 

5. The average number of completed LM&AI per household is 
the same for Category 8 households as for Category 1, 2, and 3 
households. 

6. The average number of LESA children per household is the 
same for Category 8 households as for Category 1, 2, and 3 
households. 



The effects of these assumptions on the "raw" data are presented in the 
following tables. 



Household    Number          Percent         Number 
Codes        Complete SCR    Complete HHQ    Complete HHQ 

1, 2, & 3    25,358           6.5            1,652 
8             5,790          13.0              753 
Totals       31,148                          2,405 



                         Average                   Average Number 
             Number      Number                    of Completed    Number 
Household    Complete    Sampled Per    Number     LM&AI Per       of Completed 
Codes        HHQ         Household      Sampled    Household       LM&AI 

1, 2, & 3    1,652       1.78           2,953      1.16            1,909 
8              753       1.78           1,340      1.16              873 
Totals       2,405                      4,293                      2,782 



Household    Number of 
Codes        Completed LM&AI    Percent LESA    Number LESA 

1, 2, & 3    1,909              71.24           1,360 
8              873              71.24             622 
Totals       2,782                              1,982 
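
The Category 8 rows in these tables can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: the function name is hypothetical, rounding to whole counts at each step is an assumption, and Assumption 6 is applied through the tabulated percent LESA rather than through a per-household average.

```python
# Tabulated figures for responding households (Categories 1, 2, and 3).
PCT_HHQ_123 = 6.5     # percent of screened households completing the HHQ
AVG_SAMPLED = 1.78    # average number sampled per household
AVG_LMAI = 1.16       # average number of completed LM&AI per household
PCT_LESA = 71.24      # percent of completed LM&AI classified as LESA


def category8_counts(scr_complete: int) -> dict:
    """Propagate the stated assumptions through the Category 8 households.

    Rounding to whole counts at every step is an assumption of this sketch.
    """
    hhq = round(scr_complete * (2 * PCT_HHQ_123) / 100)  # Assumptions 1-3
    sampled = round(hhq * AVG_SAMPLED)                    # Assumption 4
    lmai = round(hhq * AVG_LMAI)                          # Assumption 5
    lesa = round(lmai * PCT_LESA / 100)                   # Assumption 6
    return {"hhq": hhq, "sampled": sampled, "lmai": lmai, "lesa": lesa}


print(category8_counts(5_790))
# {'hhq': 753, 'sampled': 1340, 'lmai': 873, 'lesa': 622}
```

Adding these figures to the Category 1, 2, and 3 rows gives the totals shown in the tables (2,405; 4,293; 2,782; and 1,982).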






Changes in Values 



The raw data were used to compute average values for the variables in 
Appendix 6.6. The computed values are presented in the table below. 
The formulae have been omitted because they appear in Appendix 6.6. 
Although the use of the symbols is not entirely appropriate and is not 
precisely consistent with the definitions presented in Appendix 6.6, 
the results are presented in this manner to make it easy for the reader 
to follow the calculations. 





                        Average or Estimated Value    Average or Estimated Value 
Variable                Before Assumptions            After Assumptions 

r . .                      33,283                        33,283 
r' . .                     25,358                        31,148 
W' . .                       64.5                          64.5 
                        2,146,753                     2,146,753 
ns                      1,635,591                     2,009,046 
hij                          83.9                          68.9 
                            1,762                         2,515 
                            1,652                         2,405 
M ns                      147,832                       173,284 
                          138,603                       165,704 
W (2)                        89.5                          72.1 
                        3,084,452                     3,048,452 
                            2,953                         4,293 
                           34,061                        49,573 
                            1,032                           833 
C . .                       1,909                         2,782 
N ns                    3,047,496                     3,571,776 
n ns                    1,970,088                     2,314,624 
hijm 
Q (adjusted for SIE)        1,997                         1,370 
Total NELB              3,811,850                     3,811,850 
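
Several of the tabulated values are numerically consistent with a simple relationship between a count, an average weight, and a weighted total. The sketch below checks a few of them. The relationship is inferred from the numbers alone, not from the formulae in Appendix 6.6, and the function names are hypothetical.

```python
def weighted_total(count: int, avg_weight: float) -> float:
    """Weighted total implied by a sample count and an average weight."""
    return count * avg_weight


def implied_weight(total: float, count: int) -> float:
    """Average weight implied by a weighted total and a sample count."""
    return total / count


# Completed counts of 25,358 (before) and 31,148 (after) at a weight of 64.5.
print(weighted_total(25_358, 64.5))              # 1635591.0
print(weighted_total(31_148, 64.5))              # 2009046.0

# Weights implied by the totals 147,832 and 173,284 over 1,652 and 2,405.
print(round(implied_weight(147_832, 1_652), 1))  # 89.5
print(round(implied_weight(173_284, 2_405), 1))  # 72.1
```

On this reading, raising the number of completed cases while holding the weighted total fixed lowers the average weight, which is the pattern the table shows.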



Analysis 

The assumptions about the Category 8 households increased the sampled 
number of NELBs from 1,909 to 2,782. This is approximately a 46% increase. 
However, because of the weighting procedures, this increase has no 
meaningful effect on the total U.S. estimates. 

The assumptions about the Category 8 households increased the sampled 
number of LESAs from 1,360 to 1,982. This is approximately a 46% 
increase. The effect on the total U.S. estimate depends upon 
assumptions about how these cases are weighted. The table presented 
below reports the average weights that have been used to this point in 
the analysis. 



Assumption    Type    Number    Average Weight    U.S. Estimate 

Before        NELB    1,909     1,997             3,811,850 
After         NELB    2,782     1,370             3,811,850 
Before        LESA    1,360     1,771             2,408,908 
After         LESA    1,982     Unknown           Unknown 






This table indicates that before the assumptions, the average LESA 
weight is less than the average NELB weight. The assumptions that 
have been made should not affect this relationship, and the average 
LESA weight should continue to be less than the average NELB weight. 
However, to test response bias, assume the NELB and LESA average 
weights are the same after the assumptions and are equal to 1,370. 
This yields a total U.S. estimate of 2,715,340 LESAs. This estimate 
is 306,432 LESAs greater than the LESA estimate reported by the CESS 
study. However, an estimate of 2,715,340 LESAs is within the 95% 
confidence interval of the total U.S. LESA estimate reported by the 
CESS study. 
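
The arithmetic in this analysis can be traced with the short sketch below. It assumes, as the text does, that the LESA cases take the same average weight as the NELB cases after the assumptions. The function names are hypothetical, and the reported CESS estimate is backed out from the 306,432 difference stated above.

```python
NELB_TOTAL = 3_811_850  # total U.S. NELB estimate, unchanged by the assumptions


def percent_increase(before: int, after: int) -> float:
    """Percentage increase from `before` to `after`."""
    return 100.0 * (after - before) / before


def equal_weight_lesa_estimate(nelb_after: int, lesa_after: int) -> int:
    """LESA estimate when LESA cases take the average NELB weight."""
    avg_weight = round(NELB_TOTAL / nelb_after)  # 1,370 for 2,782 cases
    return lesa_after * avg_weight


print(round(percent_increase(1_909, 2_782)))  # 46 (NELB sample)
print(round(percent_increase(1_360, 1_982)))  # 46 (LESA sample)

estimate = equal_weight_lesa_estimate(2_782, 1_982)
print(estimate)            # 2715340
print(estimate - 306_432)  # 2408908, the implied CESS LESA estimate
```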



Conclusion 

The analysis that has been reported here is rather simple and 
superficial. Some of the assumptions that have been made border on being 
outrageous. Nevertheless, the results of the analysis indicate that 
these assumptions do not create meaningful differences in the final 
estimates. 







C: Classification Errors in Selection of a 
Criterion Score on the Language Measurement 
and Assessment Inventory (Table A-4) 






REVISED 



TABLE A-4 

Classification Errors in Selection of a Criterion Score on the Language 
Measurement and Assessment Inventory 

       Proficiency in    Proficiency in English 
       English on        on the Criterion (a)      Critical    Percent 
AGE    the Predictor     Fluent      Limited       Score       Accuracy (b) 

 5     Fluent            32           0            19          90.0 
       Limited            7          30 

 6     Fluent            27           0            26          87.3 
       Limited            9          35 

 7     Fluent            31           0            39          89.6 
       Limited            7          29 

 8     Fluent            36           0            39          95.3 
       Limited            3          25 

 9     Fluent            35           0            43          91.0 
       Limited            6          26 

10     Fluent            35           0            49          91.5 
       Limited            6          30 

11     Fluent            34           0            41          82.1 
       Limited           14          30 

12     Fluent            27           6            47          83.6 
       Limited            3          19 

13     Fluent            42           0            48          92.5 
       Limited            5          20 

14     Fluent            39          10            52          82.9 
       Limited            4          29 

a. Entries are number of cases in field test three. 

b. For example, percent correct at age 5 equals 100 (32+30)/69 = 90.0. 
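
Footnote b can be expressed as a small function over the four cell counts for an age group. This is a minimal sketch with hypothetical argument names; it treats the cases where predictor and criterion agree as the correct classifications, as the footnote does.

```python
def percent_accuracy(fluent_fluent: int, fluent_limited: int,
                     limited_fluent: int, limited_limited: int) -> float:
    """Percent of cases where predictor and criterion agree.

    Arguments are named <predictor>_<criterion>; for example, limited_fluent
    counts cases rated Limited on the predictor and Fluent on the criterion.
    """
    total = fluent_fluent + fluent_limited + limited_fluent + limited_limited
    if total == 0:
        raise ValueError("no cases to classify")
    return 100.0 * (fluent_fluent + limited_limited) / total


# Age 8: 36 and 25 cases agree out of 64.
print(round(percent_accuracy(36, 0, 3, 25), 1))  # 95.3
# Age 12: 27 and 19 cases agree out of 55.
print(round(percent_accuracy(27, 6, 3, 19), 1))  # 83.6
```

For age 5 the same calculation gives about 89.9 before any further rounding, which the table reports as 90.0.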








Values of L, F, P11, P12, P21, P22, N1, and 
N2 for the Minimized Misclassifications of 
LEP Children by Age Cohort 






Appendix D. Values of L, F, P11, P12, P21, P22, N1, and N2 for the 
Minimized Misclassifications of LEP Children by Age Cohort 

Age Cohort    L         F         P11      P12      P21      P22      N1        N2 

 5            254657     73213    1.000    0.063    0.000    0.937    249734     78136 
 6            303584     90989    0.963    0.091    0.037    0.909    306970     87603 
 7            318470    144466    0.968    0.056    0.032    0.944    320774    142162 
 8            280256     37083    1.000    0.071    0.000    0.929    277422     39917 
 9            188187    143003    0.971    0.031    0.029    0.969    189277    141913 
10            330979     32565    1.000    0.056    0.000    0.944    329047     34497 
11            271485    132731    0.971    0.091    0.029    0.909    266706    137510 
12            208426    191107    0.879    0.136    0.121    0.864    207388    192145 
13            229986     94240    0.976    0.080    0.024    0.920    227732     96494 
14            245045    241365    0.877    0.121    0.123    0.879    246282    240128 
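
The tabulated values appear to satisfy the pair of linear equations L = P11 N1 + P12 N2 and F = P21 N1 + P22 N2. This relationship is inferred from the numbers in the table, not from a definition given in this appendix. Under that assumption, N1 and N2 can be recovered from L, F, and the four proportions by solving the two equations, as the following sketch shows. The function name is hypothetical.

```python
def solve_counts(L: float, F: float,
                 p11: float, p12: float, p21: float, p22: float):
    """Solve L = p11*N1 + p12*N2 and F = p21*N1 + p22*N2 for (N1, N2)."""
    det = p11 * p22 - p12 * p21
    if det == 0:
        raise ValueError("the system has no unique solution")
    n1 = (p22 * L - p12 * F) / det   # Cramer's rule, first unknown
    n2 = (p11 * F - p21 * L) / det   # Cramer's rule, second unknown
    return n1, n2


# Age 12 row: the table lists N1 = 207388 and N2 = 192145.
n1, n2 = solve_counts(208426, 191107, 0.879, 0.136, 0.121, 0.864)
print(round(n1), round(n2))
```

Because the proportions are printed to three decimal places, the solved values agree with the tabulated N1 and N2 only to within about one case.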






E: Contributors 






CONTRIBUTORS 



This report was completed while the author was assigned to the National Center 
for Education Statistics (NCES) as an Education Policy Fellow sponsored by the 
Institute for Educational Leadership, The George Washington University, and 
Dr. Rolf M. Wulfsberg, the Assistant Administrator for Research and Analysis 
at NCES. The author gratefully acknowledges Dr. Wulfsberg's encouragement 
and support during the development and completion of this project. 

Dr. Roslyn A. Korb (Technical Planning Officer, NCES) and Dr. Donald McLaughlin 
(Statistical Analysis Group in Education, NCES and the American Institutes for 
Research) provided technical opinions on the Children's English and Services 
Study (CESS) in the formative stage of this methodological review. 

Dr. Edward A. DeAvila (Linguametrics Group) and Dr. Rudolph C. Troike (Director, 
National Clearinghouse for Bilingual Education) provided psychometric and 
linguistic perceptions on language development, measurement, and assessment. 

Dr. Rebecca Oxford (InterAmerica Research Associates, Inc.) shared several 
references on bilingual education with the author. 

Those individuals who responded to the issues paper (appendix A) provided the 
author with historical information on the development of the CESS which was 
essential for obtaining an in-depth understanding of the study. 

Finally, Mr. Richard Haber (Division of Multilevel Education Statistics, NCES) 
completed several programming tasks for this project. 



